Predicting Workers’ Compensation Insurance Fraud Using SAS Enterprise Miner 5.1 and SAS Text Miner

Author

  • Terry J. Woodfield
Abstract

Insurance fraud costs the property and casualty insurance industry over 25 billion dollars (USD) annually. This paper addresses workers' compensation claim fraud. A data mining approach is adopted, and issues of data preparation are discussed. The focus is on building predictive models to score an open claim for a propensity to be fraudulent. A key component to modeling is the use of textual data to enhance predictive accuracy. Binary response modeling is emphasized, but a two-stage modeling approach is briefly addressed. SAS Enterprise Miner 5.1 with SAS Text Miner provides the modeling environment for the analysis.

INTRODUCTION

Insurance fraud costs the property and casualty insurance industry over 25 billion dollars (USD) annually. Workers’ compensation insurance alone accounts for a sizable portion of this total cost. Workers’ compensation fraud can be divided into three types:

  • Claimant fraud occurs when an individual falsely claims to have experienced a work-related injury or exaggerates a legitimate injury and files for insurance benefits.
  • Provider fraud occurs when a provider bills an insurance company for services that were not provided.
  • Internal fraud occurs when someone inside the insurance organization creates a fictitious claim to route benefits to an accomplice.

Sometimes two or more individuals conspire to commit fraud, which leads to fraud cases that have elements of two or three of the possible types of fraud. For example, a claimant and a provider might conspire to commit fraud.

A standard operational practice is to have claims adjustors or claims supervisors identify potential fraud cases and route them to a Special Investigative Unit (SIU). The “traditional” SIU employs professionals who have skills in fraud investigation such as surveillance, evidence collection, and evidence analysis. These SIU investigators often have a law enforcement background.
One shortcoming of the traditional SIU organization is the absence of computer data processing and predictive modeling professionals. Consequently, details of fraud investigations are rarely stored in electronic form in a modern database.

This paper outlines a strategy for detecting claimant fraud in workers’ compensation insurance. It is assumed that a traditional referral process and SIU organization exist. Suggestions are made to help an organization transition to a modern data mining approach to fraud detection. While the primary impact will be on the referral process, there are implications for improved data processing to facilitate future predictive modeling.

INSURANCE FRAUD CONTRASTED WITH OTHER FORMS OF FRAUD

Modern success stories for data-mining fraud detection include applications in credit-card fraud and telecommunications fraud, which are much easier to model than insurance fraud. Credit-card and telecommunications fraud is easily identified within a relatively short period of time because a customer contests a bill or reports a stolen credit card. Essentially, the absence of a payment is a key component of fraud in credit and telecommunications applications.

With insurance fraud, if the fraudulent behavior is not discovered, the insurer never knows that the fraud has occurred. Consequently, an uninvestigated claim cannot be labeled with respect to fraud. Only claims that are investigated by the SIU can be labeled FRAUD=YES or FRAUD=NO. This implies that, out of hundreds of thousands of claims that have detailed claim information, only about 1% can be used to train a predictive model. Contrast this to credit-card fraud where all cases can be included in modeling. Part of a proposed solution to the insurance fraud problem is to process the uninvestigated insurance claims so that some of these cases can also be used in modeling.

SUGI 30 Data Mining and Predictive Modeling
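The labeling constraint described above can be illustrated with a minimal Python sketch. It is not taken from the paper, which uses SAS Enterprise Miner; the `Claim` record, its field names, and the toy data are all hypothetical. The sketch only shows how claims with an SIU outcome (FRAUD=YES or FRAUD=NO) would be separated from uninvestigated claims, and how small the labeled share can be.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Claim:
    """Hypothetical claim record; field names are illustrative only."""
    claim_id: str
    investigated: bool      # True if the SIU investigated the claim
    fraud: Optional[bool]   # SIU finding; None when the claim was never investigated


def split_by_label(claims: List[Claim]) -> Tuple[List[Claim], List[Claim]]:
    """Separate claims that carry an SIU outcome from those with no label."""
    labeled = [c for c in claims if c.investigated and c.fraud is not None]
    unlabeled = [c for c in claims if not (c.investigated and c.fraud is not None)]
    return labeled, unlabeled


def labeled_fraction(claims: List[Claim]) -> float:
    """Share of claims that could be used to train a supervised model."""
    if not claims:
        return 0.0
    labeled, _ = split_by_label(claims)
    return len(labeled) / len(claims)


# Toy data (made up): 1000 claims, of which 10 were investigated by the SIU
# and 4 of those were found to be fraudulent.
claims = [
    Claim(
        claim_id=f"C{i:04d}",
        investigated=(i < 10),
        fraud=(i < 4) if i < 10 else None,
    )
    for i in range(1000)
]

labeled, unlabeled = split_by_label(claims)
print(len(labeled), len(unlabeled), labeled_fraction(claims))
```

In this toy setup only the ten investigated claims carry a target value, so a binary response model could be trained on 1% of the data; the remaining claims would need some additional processing before they could contribute.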


Related resources

Predictive Modeling in the Insurance Industry Using SAS Software

Insurance companies, third party insurance administrators, state insurance funds, state regulatory agencies, insurance industry consultants, and application service providers (ASP) use SAS software in a variety of predictive modeling situations. SAS Enterprise Miner provides tools for modeling continuous responses, such as monetary losses and time off work, and discrete responses, such as fraud...


Feature-based Sentiment Analysis on Android App Reviews Using SAS® Text Miner and SAS® Sentiment Analysis Studio

Sentiment analysis is a popular technique for summarizing and analyzing consumers’ textual reviews about products and services. There are two major approaches for performing sentiment analysis: statistical model-based approaches and Natural Language Processing (NLP)-based approaches to create rules. In this study, we first apply text mining to summarize users’ reviews of Android Apps and extrac...


Investigating Open Source Project Success: A Data Mining Approach to Model Formulation, Validation and Testing

This paper demonstrates the use of Data Mining (DM) techniques in exploratory research. A robust model for identifying the factors that explain the success of Open Source Software (OSS) projects is created, validated and tested. The predictive modeling techniques of Logistic Regression (LR), Decision Trees (DT) and Neural Networks (NN) are used together in this analysis. Using Text Mining resul...


155-2008: Cool New Features in SAS® Enterprise Miner™ 5.3

SAS released Enterprise Miner 5.3 in late 2007 with a veritable plethora of cool new features for data miners everywhere. Nearly every module of the software has been updated. New interactive data preparation tools make it easier to manipulate data and construct a sample for mining. For data exploration, Enterprise Miner now supports hierarchical market baskets to isolate interesting rules at d...



Publication date: 2005